† Corresponding author. E-mail:
Project supported by the National Natural Science Foundation of China (Grant Nos. 61372172 and 61601518).
Accurate reconstruction from a reduced data set is essential for computed tomography (CT) in fast and/or low-dose imaging applications. Conventional total variation (TV)-based algorithms apply L1-norm-based penalties, which are not as efficient as Lp (0 < p < 1) quasi-norm-based penalties in promoting the sparsity of the gradient magnitude image.
Computed tomography (CT) configured with circular scanning is widely used in many imaging applications,[1] such as medical imaging and industrial inspection. Accurate image reconstruction from sparse or few-view data is of great importance in applications that require fast imaging and/or a low radiation dose.
When the observation data are insufficient, iterative image reconstruction (IIR) algorithms, especially optimization-based ones, are often considered. A typical reconstruction method comprises two aspects: the optimization model and the corresponding algorithm. The optimization model is usually a combination of a regularization term and a data fidelity term. The regularization term describes properties of the target image itself, such as a sparse representation or a denoising functional acting on the image.
The state-of-the-art regularization methods can be roughly divided into four categories. First, among the earliest are the prior-information-based methods, such as the method proposed by Chen,[2–4] which have seen valuable applications. The feature-constrained compressed sensing reconstruction[5] is closely related to this method. Second, wavelet-, framelet-, or tight-frame-based methods have been proposed in recent years, and some of them[6–8] show practical capabilities. These methods are based on harmonic analysis, which can help recover image details in some applications. However, selecting a proper wavelet is not easy, and the most appropriate form of wavelet for practical applications is still an open question. Third, there are dictionary-learning-based methods. These methods[9,10] usually train or apply an over-complete dictionary during the reconstruction procedure. However, training such a dictionary may require a large amount of data, and it may not work for all applications. Fourth, much work on IIR algorithms has focused on total variation (TV) minimization,[11–17] which exploits the sparsity of the gradient magnitude image (GMI). TV is the L1 norm of the GMI, and reconstruction by minimizing TV can reduce the amount of data acquisition when the GMI has few non-zero entries. Some extensions of TV, such as high-order TV,[18] total generalized variation,[19] and TV-Stokes,[20] have been proposed for image reconstruction and restoration. These forms incorporate second-order (or higher) information, which can improve image quality at the cost of efficiency.
The state-of-the-art optimization-based reconstruction algorithms can also be roughly divided into four categories: 1) the adaptive-steepest-descent-projection-onto-convex-sets (ASD-POCS) framework,[11] 2) the Chambolle–Pock (CP) framework,[12,13,21,22] 3) the split-Bregman (SB) iteration,[23] and 4) the alternating direction minimization (ADM) method.[14,15] The ASD-POCS-type methods solve the constrained TV minimization by combining the traditional POCS with steepest descent, yielding relatively good performance for under-sampled[24] or low-dose datasets.[25–27] The CP framework and its extensions[28] provide a feasible tool for various problems arising in practical applications, and several valuable applications[29,30] have confirmed this property. SB originates from Bregman iteration and can solve a very broad class of L1-regularized problems; the first attempt to apply SB to CT image reconstruction, to the best of our knowledge, was carried out by Vandehinste[31] for sparse-view reconstruction. ADM is a more straightforward iteration scheme and has been widely applied in the optimization community. The basic idea of ADM is to decouple a complex problem into several simple sub-problems and to solve these sub-problems alternately and iteratively. Some of the most important applications of ADM are image reconstructions for magnetic resonance imaging (MRI),[32] CT,[33–36] and other x-ray-based imaging applications.[37]
It should be noted that the L1-norm-based TV yields easy computation via the shrinkage mapping[38] (also known as “soft thresholding”), which makes it convenient for practical CT imaging applications. However, TV is not the most direct measure of the GMI’s sparsity. It is reasonable to assume that a relaxation closer to the L0 quasi-norm measures GMI sparsity better than TV, which implies that the p-th power of the Lp norm with 0 < p < 1, i.e., the total p-variation (TpV), may serve as a better regularization term than TV.
Besides the regularization term (i.e., the TV term), the form of the data fidelity term is also of great importance. The data term should account for the inconsistencies of the observed data. It is widely known that the acquisition of projections can roughly be modeled as a Poisson random process. Compared with the conventional quadratic term based on least squares (the L2-norm-based term), the Kullback–Leibler (KL) data divergence may have better noise-suppression capability and is therefore a proper choice for building the optimization problem. This paper combines the total p-variation (TpV) with the KL data divergence for few-view CT image reconstruction. Practical algorithms are designed via ADM, and both simulations and real CT data experiments are carried out to verify the performance.
The remainder of this study is organized as follows. Section 2 describes the imaging model and the proposed KL-TpV reconstruction algorithm, Section 3 presents the simulation and real CT data experiments, and Section 4 gives the discussion and conclusion.
A typical CT system mainly consists of an x-ray source, a flat-panel detector, a mechanical gantry, and a computer-based data processing system. When modeling the imaging system, a discrete form of the x-ray transform is usually considered, which can be written as $A\boldsymbol{x} = \boldsymbol{b}$, where $A$ denotes the system matrix generated by the forward projection, $\boldsymbol{x}$ the discretized image to be reconstructed, and $\boldsymbol{b}$ the measured projection data.
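For illustration only, a minimal sketch of this discrete model is given below; the image size, the number of rays, and the random sparse matrix standing in for a real projector are all hypothetical choices, not the geometry used in this paper.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Hypothetical sizes: a 64x64 image and 80 detector bins x 20 views.
n_pixels = 64 * 64
n_rays = 80 * 20

# A stands in for the real system matrix; in practice its entries are the
# intersection lengths of each ray with each pixel.
A = sparse_random(n_rays, n_pixels, density=0.01, format="csr", random_state=0)

x_true = np.random.rand(n_pixels)   # a stand-in ground-truth image
b = A @ x_true                       # forward model: b = A x
```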
Conventional TV minimization is often related to the optimization of an L1-norm-based function. For such functions, the proximal mapping of the L1 norm (soft thresholding) is defined as $\operatorname{prox}_{\lambda|\cdot|}(t) = \max\{|t| - \lambda, 0\}\,\operatorname{sign}(t)$, where $\lambda > 0$ is the threshold determined by the regularization strength.
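As a concrete illustration, the soft-thresholding operator can be implemented element-wise; the sketch below simply assumes the standard definition written above.

```python
import numpy as np

def soft_threshold(t, lam):
    """Proximal mapping of lam * |t| (soft thresholding), applied element-wise."""
    return np.sign(t) * np.maximum(np.abs(t) - lam, 0.0)
```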
The motivation for using an alternative shrinkage mapping in place of the soft thresholding above is to promote the sparsity of the GMI more strongly. Following the p-shrinkage formulation, the mapping can be written as $s_p(t) = \max\{|t| - \lambda^{2-p}|t|^{p-1}, 0\}\,\operatorname{sign}(t)$, which reduces to soft thresholding when $p = 1$ and approaches hard thresholding as $p$ decreases.
The proof of the theorem constructs g using the Legendre–Fenchel transform of an anti-derivative of s. Although g cannot be guaranteed to have a closed-form expression for the penalty function Gp, this is an acceptable price to pay for an explicit proximal mapping, which is much more useful for implementing a practical cone-beam CT algorithm than a closed-form penalty function. To gain an intuitive impression of the p-shrinkage mapping and its induced penalty functions, we can compute sp and the corresponding g numerically; some examples are plotted in Fig.
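A minimal sketch of the p-shrinkage operator, assuming the formulation given above, is shown below; sweeping p from 1 downward illustrates the transition from soft thresholding toward harder thresholding.

```python
import numpy as np

def p_shrink(t, lam, p):
    """p-shrinkage mapping (reduces to soft thresholding when p == 1)."""
    t = np.asarray(t, dtype=float)
    mag = np.abs(t)
    # Guard against 0**(negative power); the mapping sends 0 to 0 anyway.
    with np.errstate(divide="ignore", invalid="ignore"):
        shrunk = np.maximum(mag - lam ** (2.0 - p) * mag ** (p - 1.0), 0.0)
    return np.where(mag > 0, np.sign(t) * shrunk, 0.0)

# Example: compare soft thresholding (p = 1) with p = 0.5 at lam = 0.3.
t = np.linspace(-2, 2, 9)
print(p_shrink(t, 0.3, 1.0))
print(p_shrink(t, 0.3, 0.5))
```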
In conventional applications, the projection data fitting (fidelity) term is usually built from the square of the L2 norm. However, for applications with different noise properties, the L2 norm does not always yield satisfactory results, which motivates the search for an alternative data fitting term. When the data noise is an important physical factor and must be modeled as a multivariate Poisson probability distribution, we consider the Kullback–Leibler (KL) data divergence, which is closely related to the maximum-likelihood expectation-maximization method. The KL data error function is chosen as the fidelity term, which in its standard generalized form can be written as $D_{\mathrm{KL}}(\boldsymbol{b}, A\boldsymbol{x}) = \sum_i \bigl\{ [A\boldsymbol{x}]_i - b_i + b_i \ln\bigl( b_i / [A\boldsymbol{x}]_i \bigr) \bigr\}$, where the sum runs over all detector measurements.
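A small numerical sketch of this data term, assuming the standard generalized KL form written above, could be:

```python
import numpy as np

def kl_divergence(b, Ax, eps=1e-12):
    """Generalized KL data divergence between measured data b and estimate Ax."""
    b = np.asarray(b, dtype=float)
    Ax = np.asarray(Ax, dtype=float)
    # eps guards against log(0) and division by zero in near-empty bins.
    return float(np.sum(Ax - b + b * np.log((b + eps) / (Ax + eps))))
```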
We incorporate the KL term into the TpV model, so that the objective combines the TpV regularization with the KL data fidelity, weighted by the model parameters ρ1 and ρ2, respectively.
We introduce auxiliary variables, one for the image gradient (on which the p-shrinkage acts) and one for the projection data (on which the KL proximal step acts), so that the original problem is split into simpler sub-problems.
We optimize the above problem via the ADM scheme: the corresponding augmented Lagrangian is minimized alternately with respect to the image and each auxiliary variable, followed by updates of the Lagrange multipliers.
The sub-problem for the gradient-related variable is solved in closed form by the p-shrinkage mapping, the sub-problem for the projection-related variable reduces to the proximal step of the KL divergence, and the image update amounts to a quadratic problem.
Algorithm 1: Pseudocode of the K-step ADM iteration scheme for KL-TpV reconstruction.
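Since the step-by-step listing of Algorithm 1 did not survive extraction, the following is only a hedged sketch of what a K-step ADM loop of this kind typically looks like, assuming the splitting described above (an auxiliary gradient variable handled by p-shrinkage, an auxiliary projection variable handled by a KL proximal step, and a gradient-type image update with non-negativity projection). The helper functions grad, div, kl_prox, and all parameter roles are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

def grad(img):
    """Forward-difference gradient of a 2D image, shape (2, H, W)."""
    gx = np.roll(img, -1, axis=0) - img
    gy = np.roll(img, -1, axis=1) - img
    return np.stack([gx, gy])

def div(g):
    """Discrete divergence, the negative adjoint of grad."""
    dx = g[0] - np.roll(g[0], 1, axis=0)
    dy = g[1] - np.roll(g[1], 1, axis=1)
    return dx + dy

def p_shrink(t, lam, p):
    """p-shrinkage mapping (soft thresholding when p == 1)."""
    mag = np.abs(t)
    with np.errstate(divide="ignore", invalid="ignore"):
        shrunk = np.maximum(mag - lam ** (2.0 - p) * mag ** (p - 1.0), 0.0)
    return np.where(mag > 0, np.sign(t) * shrunk, 0.0)

def kl_prox(v, b, mu, beta):
    """argmin_y  mu*KL(b, y) + (beta/2)*||y - v||^2  (positive quadratic root)."""
    c = beta * v - mu
    return (c + np.sqrt(c * c + 4.0 * beta * mu * b)) / (2.0 * beta)

def kl_tpv_adm(A, b, shape, K=200, p=0.9, rho1=5e-3, rho2=1.0,
               beta1=0.16, beta2=1.0, tau=1.0):
    """Illustrative K-step ADM loop for a KL-TpV-type objective."""
    x = np.zeros(shape)
    d = grad(x)                                 # auxiliary gradient variable
    y = np.maximum(A @ x.ravel(), 1e-8)         # auxiliary projection variable
    u1 = np.zeros_like(d)                       # scaled multiplier for d
    u2 = np.zeros_like(y)                       # scaled multiplier for y
    for _ in range(K):
        # 1) Image update: one projected gradient step on the coupling terms.
        r = A @ x.ravel() - y + u2
        g_img = -beta1 * div(grad(x) - d + u1) + beta2 * (A.T @ r).reshape(shape)
        x = np.maximum(x - tau * g_img, 0.0)
        # 2) Gradient variable update by p-shrinkage.
        d = p_shrink(grad(x) + u1, rho1 / beta1, p)
        # 3) Projection variable update by the KL proximal step.
        y = kl_prox(A @ x.ravel() + u2, b, rho2, beta2)
        # 4) Multiplier updates.
        u1 = u1 + grad(x) - d
        u2 = u2 + A @ x.ravel() - y
    return x
```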
It should be noted that there are two groups of parameters: 1) the model parameters p, ρ1, and ρ2; and 2) the algorithm parameters ξ, τ, β1, and β2. The model parameters have an impact on the designed solution, and different choices lead to different reconstructions. Unlike the model parameters, which specify the feasible solution, the algorithm parameters, as long as they are chosen within a reasonable range, have no impact on the solution specification; instead, they affect the convergence path and the rate at which the algorithm approaches the feasible solution set. A direct way to select these parameters is trial and error. For the convenience of readers implementing the method, we offer one workable choice in this paper: β1 = 0.16, β2 = 1.0, ρ1 = 0.005, ρ2 = 1.6×10^4, τ = 1.0, and ξ = 1.0.
Although the KL data discrepancy applied here is not directly derived from the Poisson noise model, it is closely related to that noise distribution. In the case of a normal exposure dose, the x-ray CT photon measurements are often modeled as independent Poisson random variables with means of $\bar{I}_i = I_0 \exp(-[A\boldsymbol{x}]_i)$, where $I_0$ denotes the incident (blank-scan) photon count.
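To make this connection explicit, a short derivation (a sketch using the notation above, with $I_i$ the measured photon counts and $\bar{I}_i$ their means) shows that the KL divergence differs from the Poisson negative log-likelihood only by terms independent of the means:

```latex
% Connection between the Poisson likelihood and the KL data divergence.
-\ln L(\bar{I}) = \sum_i \bigl( \bar{I}_i - I_i \ln \bar{I}_i + \ln I_i! \bigr),
\qquad
D_{\mathrm{KL}}(I,\bar{I})
  = \sum_i \Bigl( \bar{I}_i - I_i + I_i \ln \tfrac{I_i}{\bar{I}_i} \Bigr)
  = -\ln L(\bar{I}) + \sum_i \bigl( I_i \ln I_i - I_i - \ln I_i! \bigr).
```

Since the last sum does not depend on the means, minimizing the KL divergence with respect to $\bar{I}$ is equivalent to maximizing the Poisson likelihood.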
Strictly speaking, formulas (
In this experiment, a fan-beam geometry is set up to simulate the data acquisition of CT imaging. The typical Shepp–Logan phantom with a discretization of 256×256 pixels is used as the standard image. This image is generated by the MATLAB command phantom('Shepp-Logan', 256), which returns a low-contrast image widely used by tomography researchers. The distance from the x-ray source to the iso-center is 164.86 mm, and the distance from the x-ray source to the center of the line detector is 1052.00 mm. The number of detector bins is 512, with a size of 0.074 mm each.
In this experiment, 7 projections are evenly acquired within 180° along the circular trajectory. The reconstructions are carried out using alternating direction TV minimization (ADTVM),[35] which is an L2-norm-based fidelity plus TV minimization model (also termed L2-TV hereafter), KL-TV, and KL-TpV (with different p values); the results of each algorithm at iteration 2000 are shown in Fig.
In the rest of this subsection, more tests are carried out to gain further knowledge of the proposed method. Reconstructions using only 6 views evenly acquired within 180° are performed, with different values of p chosen within [0.90, 1.0]. It should be noted that 6 views are too few to obtain accurate reconstructions; consequently, we increase the number of detector bins to 640 for each view. Moreover, the iteration number for each algorithm is extended to 10000 to obtain meaningful results. The other geometric parameters are the same as those in the first experiment. It must be stated that, for reconstructions with 6 views of projections, the objective of the following investigations is to test the convergence behavior rather than to obtain high-accuracy images, since the projections are too few for high-accuracy reconstruction.
Some typical results are shown in Fig.
Another important issue is the selection of the value of p; here we compare KL-TpV algorithms with different p using 7 and 6 views of projections. For the reconstructions with 7 and 6 views, the best RMSEs of each reconstruction are plotted as functions of p in Fig.
Reconstructions with noisy projections are carried out and demonstrated in this subsection. The objective of this simulation is to compare the preliminary performance of the KL- and L2-norm-based data fidelity terms in the reconstructions. Poisson noise is added to the simulated projections. As in the previous studies, the phantom applied here is the low-contrast Shepp–Logan phantom.
In the simulations, the initial photon count is 66000 for the low-contrast phantom when generating the projection data. To give an impression of the noise level, the SART reconstruction from 361 views of projections acquired evenly within 360° is shown in Fig.
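For readers who wish to reproduce this kind of noisy data, a minimal sketch of Poisson noise generation for transmission projections (assuming the Beer–Lambert model and the incident count of 66000 stated above) is:

```python
import numpy as np

def add_poisson_noise(line_integrals, I0=66000, rng=None):
    """Simulate Poisson-noisy projections from noiseless line integrals."""
    rng = np.random.default_rng() if rng is None else rng
    expected_counts = I0 * np.exp(-np.asarray(line_integrals, dtype=float))
    noisy_counts = rng.poisson(expected_counts)
    # Convert back to line integrals, guarding against zero counts.
    return -np.log(np.maximum(noisy_counts, 1) / I0)
```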
In this test, 9, 30, and 60 projections evenly acquired within 180° are used for the reconstructions. Typical results are shown in Fig.
In this subsection, real CT experiments are carried out to test the proposed algorithm. The projections are acquired from an adult male rabbit in vivo on a cone-beam CT (CBCT) imaging system. The distance from the source to the rotation center is 502.8 mm, and the distance from the source to the center of the detector is 1434.7 mm. The flat-panel detector has 1024×768 bins with a pixel size of 0.3810 mm × 0.3810 mm each. The reconstructed volume is 512×512×384 with a voxel size of 0.2671 mm × 0.2671 mm × 0.2671 mm. The working voltage and current of the x-ray tube are 100 kV and
For iterative reconstructions, it is well known that fully three-dimensional reconstruction is very time consuming because the forward- and backward-projection operations have severely high computational complexity. In this paper, GPU accelerations of the two operations under the NVidia CUDA framework are incorporated in the implementation. Specifically, a fast and parallel projection algorithm developed by Gao[43] is implemented on an NVidia Tesla K40c card. With this hardware acceleration, the time consumption for each iteration loop is about 40 s.
The reconstruction from full views serves as a reference, while the result from sparse views indicates the limitation of the analytical algorithm. In Fig.
We carry out iterative reconstructions using 60 views of projections, acquired in the same way as for the sparse-view FDK reconstruction above. The iterative algorithms applied are SART-TV, ADTVM (L2-TV), and the proposed KL-TpV (p = 0.92) method. For iterative algorithms on real CT projections, properly setting the stopping criterion is an essential problem. We test the KL-TpV algorithm with the real CT data and plot the relative change (in RMSE) between the current image and the previous one in Fig.
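One common way to implement such a stopping rule (a sketch only; the tolerance value and the hypothetical helper one_more_iteration are illustrative) is to track the relative change between consecutive iterates:

```python
import numpy as np

def relative_change(x_curr, x_prev):
    """Relative change between consecutive iterates (RMSE-style ratio)."""
    return np.linalg.norm(x_curr - x_prev) / max(np.linalg.norm(x_prev), 1e-12)

# Example usage (one_more_iteration is a hypothetical single-iteration update):
# while relative_change(x_curr, x_prev) > 1e-4:
#     x_prev, x_curr = x_curr, one_more_iteration(x_curr)
```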
Slices along three directions (sagittal, coronal, and transverse sections) are chosen to demonstrate the reconstruction of each algorithm. Iterative results (with 60 projections) are shown in Figs.
Although the conventional TV is an often-used convex relaxation of the L0 quasi-norm of the image gradient, we point out in Section 2 that the TpV with 0 < p < 1 is a closer (though non-convex) relaxation and can measure the sparsity of the GMI better.
The optimization built for reconstruction applies the KL data term, which shows better noise-suppression capability than the conventional L2-norm-based methods. The KL term has a very close relationship to the transmission Poisson likelihood (TPL) term, which is derived from maximum likelihood under the Poisson noise assumption. Comparing the results in Figs.
Another essential point is that the KL-TpV reconstruction is designed using ADM rather than CP or SB. The reason is that the KL-TpV model is in fact a non-convex function: ADM does not strictly require convexity of the objective, whereas CP and SB are not directly applicable to this model. Moreover, the ADM derivation easily incorporates the p-shrinkage operator in the implementation. Even so, we find by experiments that a relatively proper range for p is (0.91, 0.93), and some necessary tuning of the parameters can further improve the performance.
This paper proposed an efficient reconstruction algorithm using total p-variation plus the Kullback–Leibler data divergence. We introduce the p-shrinkage mapping into the algorithm, which leads to an easy and concise implementation. The new method is derived via ADM and reaches promising results for very few views of projections compared with the conventional methods. The experiments demonstrate the effectiveness of this method. However, this is only a preliminary study; further optimizations and investigations will be carried out in the future.
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]